Jarque–Bera normality test (jarque_bera)#

The Jarque–Bera (JB) test is a classic normality test built around two facts that must hold if data are normally distributed:

  • Skewness should be 0 (symmetry)

  • Kurtosis should be 3 (tail thickness; equivalently excess kurtosis should be 0)

It’s often used as a quick check (especially on model residuals) to see whether the Gaussian assumption is plausible.


Learning goals#

  • Understand what the JB statistic measures (skewness + kurtosis departures from normal)

  • Implement the test from scratch with NumPy only

  • Interpret the statistic and p-value correctly

  • Build intuition via Plotly visuals (histograms, Q–Q plots, and the null distribution)

Prerequisites#

  • Sample mean/variance and central moments

  • Basic hypothesis testing (null vs alternative, p-values)

  • The chi-square distribution (JB is asymptotically χ² with 2 degrees of freedom)

import numpy as np

import plotly
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio
from plotly.subplots import make_subplots

pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")

np.set_printoptions(precision=6, suppress=True)
rng = np.random.default_rng(42)

VERSIONS = {"numpy": np.__version__, "plotly": plotly.__version__}
VERSIONS
{'numpy': '1.26.2', 'plotly': '6.5.2'}

1) What the test is for (and when to use it)#

What it tests#

The JB test is a hypothesis test of normality:

  • H₀ (null): the data are generated by a normal distribution

  • H₁ (alternative): the data are not normal

The test doesn’t try to match the entire shape of the distribution. It focuses on two features:

  • asymmetry (skewness)

  • tail heaviness / peakedness (kurtosis)

So the JB test is most informative when the key deviations you care about are skewness and/or tail behavior.

Where it shows up in practice#

Common use cases:

  • Residual diagnostics (linear regression, ARIMA, etc.)

  • Sanity checks before using methods that assume normality (or normal residuals)

  • Quick comparisons of “how non-normal” different samples look in terms of skewness and kurtosis

What it is not#

  • It is not a proof of normality. A large p-value means “insufficient evidence to reject normality”, not “the data are normal”.

  • It is not a comprehensive shape test: some non-normal distributions can have skewness near 0 and kurtosis near 3 (JB may miss them).

2) The Jarque–Bera statistic#

Let \(x_1,\dots,x_n\) be a sample.

Define the sample central moments

\[ m_k = \frac{1}{n}\sum_{i=1}^n (x_i-\bar{x})^k. \]

From these, define the (moment-based) sample skewness \(S\) and sample kurtosis \(K\):

\[ S = \frac{m_3}{m_2^{3/2}}, \qquad K = \frac{m_4}{m_2^2}. \]

For a normal distribution, \(S=0\) and \(K=3\).
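This claim is easy to check empirically. Below is a throwaway Monte Carlo sketch (separate from the helper functions defined later): for a large standard-normal sample, the moment-based \(S\) and \(K\) land near 0 and 3.

```python
import numpy as np

# Illustrative check: moment-based S and K for a large N(0, 1) sample.
rng = np.random.default_rng(123)
x = rng.standard_normal(1_000_000)

m = x.mean()
m2 = np.mean((x - m) ** 2)
S = np.mean((x - m) ** 3) / m2**1.5
K = np.mean((x - m) ** 4) / m2**2
print(f"S = {S:.4f} (target 0), K = {K:.4f} (target 3)")
```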

The Jarque–Bera statistic combines deviations in both quantities:

\[ \mathrm{JB} = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right) = \underbrace{\frac{n}{6}S^2}_{\text{skew contribution}} + \underbrace{\frac{n}{24}(K-3)^2}_{\text{kurtosis contribution}}. \]

Asymptotic null distribution and p-value#

Under \(H_0\) (normality), as \(n \to \infty\):

\[ \mathrm{JB} \;\xrightarrow[]{d}\; \chi^2(2). \]

So the p-value is

\[ \text{p-value} = \mathbb{P}(\chi^2(2) \ge \mathrm{JB}). \]

For 2 degrees of freedom, the survival function has a simple closed form:

\[ \mathbb{P}(\chi^2(2) \ge x) = e^{-x/2}. \]

That means for JB we can compute the p-value with just np.exp (no SciPy).
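To double-check the closed form before relying on it, we can integrate the χ²(2) density numerically and compare (the helper name `chi2_df2_tail_numeric` is ad hoc, used only for this check):

```python
import numpy as np

def chi2_df2_tail_numeric(x: float, upper: float = 200.0, num: int = 200_001) -> float:
    # Trapezoid integration of the chi-square(df=2) density 0.5*exp(-t/2)
    # from x up to a large cutoff; should match the closed form exp(-x/2).
    t = np.linspace(x, upper, num)
    pdf = 0.5 * np.exp(-0.5 * t)
    dt = t[1] - t[0]
    return float(np.sum(0.5 * (pdf[1:] + pdf[:-1])) * dt)

for x in (0.5, 2.0, 5.991):  # 5.991 is roughly the 5% critical value for df=2
    print(f"x={x:6.3f}  closed={np.exp(-0.5 * x):.6f}  numeric={chi2_df2_tail_numeric(x):.6f}")
```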

def _to_1d_finite(x: np.ndarray) -> np.ndarray:
    x = np.asarray(x, dtype=float).reshape(-1)
    return x[np.isfinite(x)]


def central_moment(x: np.ndarray, k: int) -> float:
    # Central moment m_k = mean((x - mean(x))^k)
    if k < 0:
        raise ValueError("k must be non-negative")

    x = _to_1d_finite(x)
    if x.size == 0:
        return float("nan")
    if k == 0:
        return 1.0

    mean = float(x.mean())
    return float(np.mean((x - mean) ** k))


def sample_skewness_kurtosis(x: np.ndarray) -> tuple[float, float]:
    # Moment-based sample skewness S and kurtosis K (normal => K=3).
    x = _to_1d_finite(x)
    if x.size < 3:
        raise ValueError("Need at least 3 finite values")

    m2 = central_moment(x, 2)
    if not np.isfinite(m2) or m2 <= 0:
        return float("nan"), float("nan")

    m3 = central_moment(x, 3)
    m4 = central_moment(x, 4)

    s = m3 / m2**1.5
    k = m4 / m2**2
    return float(s), float(k)


def jarque_bera_test(x: np.ndarray) -> dict:
    # Jarque–Bera normality test (asymptotic χ² with df=2).
    x = _to_1d_finite(x)
    n = int(x.size)
    if n < 3:
        raise ValueError("Need at least 3 finite values")

    s, k = sample_skewness_kurtosis(x)
    if not (np.isfinite(s) and np.isfinite(k)):
        return {
            "n": n,
            "statistic": float("nan"),
            "p_value": float("nan"),
            "skewness": float("nan"),
            "kurtosis": float("nan"),
            "excess_kurtosis": float("nan"),
        }

    jb = (n / 6.0) * (s**2 + ((k - 3.0) ** 2) / 4.0)

    # Under H0: JB ~ chi-square(df=2) asymptotically.
    # For df=2, survival function is exp(-x/2).
    p_value = float(np.exp(-0.5 * jb))

    return {
        "n": n,
        "statistic": float(jb),
        "p_value": p_value,
        "skewness": float(s),
        "kurtosis": float(k),
        "excess_kurtosis": float(k - 3.0),
        "skew_component": float((n / 6.0) * (s**2)),
        "kurtosis_component": float((n / 24.0) * ((k - 3.0) ** 2)),
    }
# Quick check on a few synthetic samples

samples = {
    "normal": rng.standard_normal(500),
    "t(df=3)": rng.standard_t(df=3, size=500),
    "exponential": rng.exponential(scale=1.0, size=500),
}

for name, x in samples.items():
    res = jarque_bera_test(x)
    print(
        f"{name:>12s} | JB={res['statistic']:.3f} | p={res['p_value']:.4f} | "
        f"skew={res['skewness']:.3f} | excess_kurt={res['excess_kurtosis']:.3f}"
    )
      normal | JB=0.783 | p=0.6762 | skew=0.093 | excess_kurt=-0.052
     t(df=3) | JB=2202.120 | p=0.0000 | skew=0.243 | excess_kurt=10.270
 exponential | JB=2584.447 | p=0.0000 | skew=2.379 | excess_kurt=10.071

3) How to interpret the result#

Pick a significance level \(\alpha\) (often 0.05).

  • If p-value ≤ α: reject \(H_0\) → the sample shows statistically significant evidence of non-normality (in skewness and/or kurtosis).

  • If p-value > α: fail to reject \(H_0\) → you do not have enough evidence (from skewness/kurtosis) to say it’s non-normal.

Two important interpretation details:

  1. JB is an asymptotic test. For small samples, the χ² approximation can be rough.

  2. With large n, tiny deviations become significant. In big datasets it’s common to reject normality even when the deviation is practically irrelevant.

So: always pair the p-value with effect size diagnostics (skewness, excess kurtosis) and plots.
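The large-\(n\) point is easy to see empirically. The sketch below (the compact `jb_pvalue` helper simply restates the JB formula above so the snippet runs on its own) applies the test to a mildly skewed gamma population at growing sample sizes: the shape never changes, but the p-value collapses once \(n\) is large.

```python
import numpy as np

def jb_pvalue(x: np.ndarray) -> tuple[float, float]:
    # Compact restatement of the JB statistic and p-value, mirroring
    # jarque_bera_test above, so this sketch is self-contained.
    x = np.asarray(x, dtype=float)
    n = x.size
    m = x.mean()
    m2 = np.mean((x - m) ** 2)
    s = np.mean((x - m) ** 3) / m2**1.5
    k = np.mean((x - m) ** 4) / m2**2
    jb = (n / 6.0) * (s**2 + (k - 3.0) ** 2 / 4.0)
    return float(jb), float(np.exp(-0.5 * jb))

rng = np.random.default_rng(0)
# Mildly skewed population (gamma with shape 50): skewness ≈ 0.28,
# excess kurtosis ≈ 0.12 — and neither changes with n.
for n in (100, 1_000, 10_000, 100_000):
    jb, p = jb_pvalue(rng.gamma(shape=50.0, size=n))
    print(f"n={n:>7d}  JB={jb:10.2f}  p={p:.4f}")
```

The skewness stays around 0.28 throughout; only the sample size drives the statistic from "unremarkable" to "overwhelming rejection".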

def chi2_df2_pdf(x: np.ndarray) -> np.ndarray:
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    mask = x >= 0
    out[mask] = 0.5 * np.exp(-0.5 * x[mask])
    return out


def chi2_df2_sf(x: np.ndarray) -> np.ndarray:
    x = np.asarray(x, dtype=float)
    out = np.ones_like(x)
    mask = x >= 0
    out[mask] = np.exp(-0.5 * x[mask])
    return out


def plot_jb_pvalue(jb: float, *, title: str = "JB vs χ²(2) (right-tail p-value)") -> go.Figure:
    xmax = float(max(20.0, jb * 1.5))
    xs = np.linspace(0.0, xmax, 600)

    fig = go.Figure()
    fig.add_trace(go.Scatter(x=xs, y=chi2_df2_pdf(xs), mode="lines", name="χ²(2) PDF"))

    xs_tail = xs[xs >= jb]
    fig.add_trace(
        go.Scatter(
            x=np.r_[xs_tail, xs_tail[::-1]],
            y=np.r_[chi2_df2_pdf(xs_tail), np.zeros_like(xs_tail)],
            fill="toself",
            fillcolor="rgba(31, 119, 180, 0.2)",
            line=dict(color="rgba(0,0,0,0)"),
            hoverinfo="skip",
            name="p-value area",
            showlegend=True,
        )
    )

    p = float(chi2_df2_sf(jb))
    fig.add_vline(x=jb, line_width=2, line_dash="dash", line_color="firebrick")
    fig.update_layout(
        title=f"{title}<br><sup>JB={jb:.3f}, p-value={p:.4f}</sup>",
        xaxis_title="x",
        yaxis_title="density",
        legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="left", x=0.0),
    )
    return fig


res_norm = jarque_bera_test(samples["normal"])
plot_jb_pvalue(res_norm["statistic"], title="Normal sample: JB null distribution")

4) The null distribution: simulated JB values under true normality#

def simulate_jb_under_normal(*, n: int, m: int, rng: np.random.Generator) -> np.ndarray:
    jb_vals = np.empty(m, dtype=float)
    for i in range(m):
        x = rng.standard_normal(n)
        jb_vals[i] = jarque_bera_test(x)["statistic"]
    return jb_vals

n = 200
m = 5000
jb_vals = simulate_jb_under_normal(n=n, m=m, rng=rng)

xs = np.linspace(0.0, np.quantile(jb_vals, 0.995), 600)

alpha = 0.05
crit = float(-2.0 * np.log(alpha))  # df=2
rejection_rate = float(np.mean(jb_vals >= crit))

fig = go.Figure()
fig.add_trace(go.Histogram(x=jb_vals, nbinsx=60, histnorm="probability density", name="simulated JB"))
fig.add_trace(go.Scatter(x=xs, y=chi2_df2_pdf(xs), mode="lines", name="χ²(2) PDF"))
fig.add_vline(x=crit, line_dash="dash", line_color="firebrick")
fig.update_layout(
    title=(
        "JB under H₀ (true normality)" +
        f"<br><sup>n={n}, simulations={m}, α={alpha} ⇒ critical={crit:.3f}, empirical rejection≈{rejection_rate:.3f}</sup>"
    ),
    xaxis_title="JB statistic",
    yaxis_title="density",
)
fig

5) What JB is “measuring” (geometry in skewness/kurtosis space)#

Because

\[ \mathrm{JB} = \frac{n}{6}S^2 + \frac{n}{24}(K-3)^2, \]

JB is a weighted squared distance from the normal point \((S, K-3) = (0, 0)\): the skewness term carries weight \(n/6\) and the excess-kurtosis term weight \(n/24\).

  • Moving away from 0 skewness increases JB.

  • Moving away from 0 excess kurtosis increases JB.

  • The same p-value corresponds (approximately) to an ellipse in \((S, K-3)\) space.

The plot below shows this idea.

# A contour view of JB as a function of (skewness, excess kurtosis)

n_for_geometry = 200
s_grid = np.linspace(-1.2, 1.2, 241)
e_grid = np.linspace(-2.5, 2.5, 241)  # excess kurtosis
S, E = np.meshgrid(s_grid, e_grid)
JB_grid = (n_for_geometry / 6.0) * (S**2 + (E**2) / 4.0)

alpha_levels = [0.10, 0.05, 0.01]
crit_levels = {a: float(-2.0 * np.log(a)) for a in alpha_levels}  # df=2

fig = go.Figure()
fig.add_trace(
    go.Contour(
        x=s_grid,
        y=e_grid,
        z=JB_grid,
        contours=dict(coloring="lines", showlabels=True),
        line=dict(width=1),
        showscale=False,
        hovertemplate="skew=%{x:.3f}<br>excess kurt=%{y:.3f}<br>JB≈%{z:.3f}<extra></extra>",
    )
)

for a, q in crit_levels.items():
    rhs = (6.0 * q) / n_for_geometry
    s_line = np.linspace(-np.sqrt(rhs), np.sqrt(rhs), 400)
    e_line = 2.0 * np.sqrt(np.maximum(0.0, rhs - s_line**2))
    fig.add_trace(go.Scatter(x=s_line, y=e_line, mode="lines", name=f"α={a} boundary"))
    fig.add_trace(go.Scatter(x=s_line, y=-e_line, mode="lines", showlegend=False))

pts = []
for name, x in samples.items():
    r = jarque_bera_test(x)
    pts.append((name, r["skewness"], r["excess_kurtosis"]))

fig.add_trace(
    go.Scatter(
        x=[p[1] for p in pts],
        y=[p[2] for p in pts],
        mode="markers+text",
        text=[p[0] for p in pts],
        textposition="top center",
        marker=dict(size=10, color="firebrick"),
        name="example samples",
        hovertemplate="%{text}<br>skew=%{x:.3f}<br>excess kurt=%{y:.3f}<extra></extra>",
    )
)

fig.update_layout(
    title=(
        "JB statistic as a function of skewness and excess kurtosis"
        + f"<br><sup>n={n_for_geometry}; JB contours + approximate rejection boundaries</sup>"
    ),
    xaxis_title="skewness S",
    yaxis_title="excess kurtosis (K − 3)",
)
fig

6) Visual intuition: histogram + Q–Q plot#

A good workflow is:

  1. Run JB (get p-value, skewness, excess kurtosis)

  2. Inspect a standardized histogram with a normal overlay

  3. Inspect a Q–Q plot (an S-shaped pattern indicates non-normal tail weight; one-sided concave or convex curvature indicates skew)

Below we compare three distributions after z-scoring (mean 0, std 1).

def norm_pdf(x: np.ndarray) -> np.ndarray:
    x = np.asarray(x, dtype=float)
    return (1.0 / np.sqrt(2.0 * np.pi)) * np.exp(-0.5 * x**2)


def norm_ppf(p: np.ndarray) -> np.ndarray:
    # Approximate inverse CDF of the standard normal distribution.
    # Vectorized rational approximation (Acklam). Returns NaN for p outside (0, 1).

    p = np.asarray(p, dtype=float)
    x = np.full_like(p, np.nan, dtype=float)

    a = np.array(
        [
            -39.69683028665376,
            220.9460984245205,
            -275.9285104469687,
            138.3577518672690,
            -30.66479806614716,
            2.506628277459239,
        ]
    )
    b = np.array(
        [
            -54.47609879822406,
            161.5858368580409,
            -155.6989798598866,
            66.80131188771972,
            -13.28068155288572,
        ]
    )
    c = np.array(
        [
            -0.007784894002430293,
            -0.3223964580411365,
            -2.400758277161838,
            -2.549732539343734,
            4.374664141464968,
            2.938163982698783,
        ]
    )
    d = np.array(
        [
            0.007784695709041462,
            0.3224671290700398,
            2.445134137142996,
            3.754408661907416,
        ]
    )

    plow = 0.02425
    phigh = 1.0 - plow

    valid = (p > 0.0) & (p < 1.0)

    mask = valid & (p < plow)
    if np.any(mask):
        q = np.sqrt(-2.0 * np.log(p[mask]))
        num = (((((c[0] * q + c[1]) * q + c[2]) * q + c[3]) * q + c[4]) * q + c[5])
        den = ((((d[0] * q + d[1]) * q + d[2]) * q + d[3]) * q + 1.0)
        x[mask] = -(num / den)

    mask = valid & (p >= plow) & (p <= phigh)
    if np.any(mask):
        q = p[mask] - 0.5
        r = q * q
        num = (((((a[0] * r + a[1]) * r + a[2]) * r + a[3]) * r + a[4]) * r + a[5]) * q
        den = (((((b[0] * r + b[1]) * r + b[2]) * r + b[3]) * r + b[4]) * r + 1.0)
        x[mask] = num / den

    mask = valid & (p > phigh)
    if np.any(mask):
        q = np.sqrt(-2.0 * np.log(1.0 - p[mask]))
        num = (((((c[0] * q + c[1]) * q + c[2]) * q + c[3]) * q + c[4]) * q + c[5])
        den = ((((d[0] * q + d[1]) * q + d[2]) * q + d[3]) * q + 1.0)
        x[mask] = num / den

    return x


def zscore(x: np.ndarray) -> np.ndarray:
    x = _to_1d_finite(x)
    mu = x.mean()
    sigma = x.std(ddof=0)
    if sigma == 0:
        raise ValueError("Cannot z-score a constant sample")
    return (x - mu) / sigma


labels = list(samples.keys())
z_samples = {k: zscore(v) for k, v in samples.items()}
results = {k: jarque_bera_test(v) for k, v in z_samples.items()}

subplot_titles = [
    f"{k}<br><sup>JB={results[k]['statistic']:.2f}, p={results[k]['p_value']:.4f}</sup>" for k in labels
]

fig = make_subplots(
    rows=2,
    cols=3,
    subplot_titles=subplot_titles * 2,
    vertical_spacing=0.12,
    row_heights=[0.55, 0.45],
)

x_grid = np.linspace(-4.0, 4.0, 500)

for j, k in enumerate(labels, start=1):
    z = z_samples[k]

    fig.add_trace(
        go.Histogram(x=z, nbinsx=45, histnorm="probability density", name=f"{k} hist", showlegend=False),
        row=1,
        col=j,
    )
    fig.add_trace(
        go.Scatter(
            x=x_grid,
            y=norm_pdf(x_grid),
            mode="lines",
            line=dict(color="black"),
            name="N(0,1)",
            showlegend=(j == 1),
        ),
        row=1,
        col=j,
    )

    z_sorted = np.sort(z)
    n = z_sorted.size
    p = (np.arange(1, n + 1) - 0.5) / n
    q = norm_ppf(p)

    fig.add_trace(
        go.Scatter(x=q, y=z_sorted, mode="markers", marker=dict(size=4), name=f"{k} QQ", showlegend=False),
        row=2,
        col=j,
    )
    fig.add_trace(
        go.Scatter(
            x=[q.min(), q.max()],
            y=[q.min(), q.max()],
            mode="lines",
            line=dict(color="firebrick", dash="dash"),
            name="y=x",
            showlegend=(j == 1),
        ),
        row=2,
        col=j,
    )

for j in range(1, 4):
    fig.update_xaxes(title_text="z", row=1, col=j)
    fig.update_yaxes(title_text="density", row=1, col=j)

    fig.update_xaxes(title_text="theoretical quantiles (N(0,1))", row=2, col=j)
    fig.update_yaxes(title_text="sample quantiles (z-scored)", row=2, col=j)

fig.update_layout(title="Distributions (z-scored): histogram + Q–Q plot", height=850)
fig

7) Practical notes & pitfalls#

  • Asymptotic nature: JB relies on a large-sample χ² approximation. For small \(n\), use JB as a hint and rely more on plots or small-sample tests.

  • Outliers: skewness and kurtosis are highly sensitive to outliers → a single extreme value can drive rejection.

  • Power and misspecification: JB is tuned to skewness/kurtosis. It can have low power against other deviations (e.g., some symmetric mixtures).

  • Big n: with enough data, any small deviation becomes statistically significant. Always check whether non-normality is practically important.

A good “diagnostics bundle” is: JB p-value + (skewness, excess kurtosis) + histogram + Q–Q plot.
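The outlier point above deserves a quick demo. In this sketch (the `jb_p` helper again restates the JB formula inline so the snippet stands alone), a single fabricated 8-sigma value appended to an otherwise normal sample is enough to drive the p-value to essentially zero.

```python
import numpy as np

def jb_p(x):
    # JB statistic and p-value, restating the formula from jarque_bera_test
    # so this demo is self-contained.
    x = np.asarray(x, dtype=float)
    n, m = x.size, x.mean()
    m2 = np.mean((x - m) ** 2)
    s = np.mean((x - m) ** 3) / m2**1.5
    k = np.mean((x - m) ** 4) / m2**2
    jb = (n / 6.0) * (s**2 + (k - 3.0) ** 2 / 4.0)
    return jb, float(np.exp(-0.5 * jb))

rng = np.random.default_rng(7)
x = rng.standard_normal(300)

jb0, p0 = jb_p(x)
jb1, p1 = jb_p(np.append(x, 8.0))  # one fabricated 8-sigma observation

print(f"clean sample:   JB={jb0:8.2f}  p={p0:.4f}")
print(f"with 1 outlier: JB={jb1:8.2f}  p={p1:.4g}")
```

The fourth central moment raises the outlier to the fourth power, which is why a single extreme value can dominate the kurtosis component.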

Exercises#

  1. Simulate JB rejection rates under \(H_0\) for different sample sizes (\(n=20,50,200,1000\)). How quickly does the χ² approximation settle?

  2. Build two non-normal samples with skewness near 0 and kurtosis near 3 (e.g. symmetric mixtures) and see whether JB detects them.

  3. Apply JB to regression residuals from a simple linear model with (a) Gaussian noise and (b) heavy-tailed noise. Compare with Q–Q plots.

References#

  • Jarque, C. M., & Bera, A. K. (1980). Efficient tests for normality, homoscedasticity and serial independence of regression residuals. Economics Letters, 6(3), 255–259.

  • Jarque, C. M., & Bera, A. K. (1987). A test for normality of observations and regression residuals. International Statistical Review, 55(2), 163–172.